GENERAL NOTES


The anniversary of the founding of the Center for Documentation and 
Communication Research (CDCR) in 1955 by Dean Jesse H. Shera was not
formally celebrated at Western Reserve, even though the "holes" from cards
and paper tape processed over that 10-year period would have provided
more than enough confetti. A deeper satisfaction came from two other 
facts. 

First, the level of research activity at the Center hit an all-time 
high in 1964-1965, measured both by the number of projects and by the 
necessary outside financial support, gratefully received from a number of 
agencies. Most recently, the Center has announced the completion of a 
contract with the National Science Foundation to investigate the varia- 
bility of human relevance assessments in relation to document searching. 
Director of the project is Asst. Professor Alan M. Rees; his co-principal 
investigator is Dr. Douglas G. Schultz of Western Reserve’s Department 
of Psychology. The $94,588.00 contract is for a two-year period. 

Second, the addition of a disc-storage unit to the GE-225 complex 
made it possible for the Center's Computer Laboratory to make available 
its services and cooperation to an ever-increasing number of users both 
in the University and in the community at large. Examples of the latter 
include the American Cancer Society, the Cystic Fibrosis Foundation, etc. 
A full second-shift operation was inaugurated by Asst. Professor Robert 
Lyle Jacobs, Manager of the Computer Laboratory. He organized the first 
of a series of two-week Fortran courses, intended to familiarize faculty 
and staff with computer capabilities. The courses were taught by Irene 
Reineks, head of the programming staff, and Michael Smith. 
Among the numerous publications deriving from the Center's educational
and research programs, two are worthy of separate note here:

The Education of Science Information Personnel - 1964.
Proceedings of an Invitational Conference.
A. J. Goldwyn and Alan M. Rees, eds. (Available
by request to the Center at $4.00 per copy.*)

A Selected Bibliography: Autumn, 1965. (Available
by request to the Center without charge.)

A few comments under the general heading of "Personnel" may be of
interest, although space must restrict these to those who have travelled
the longest distance to join the staff:

George Ember, formerly of the Institut de
Médecine et Chirurgie expérimentales,
Université de Montréal, joins the Center's
staff as Lecturer and Research Assistant.
He is co-author (with Hans Selye) of the
most recent edition of Symbolic Shorthand
System for Physiology and Medicine.

Dr. Madugula I. Sastri, who worked at the
Center during 1961-1962 while completing
work on his doctorate, has returned as a
full-time Research Assistant (Structural
Linguist). During the intervening period
he taught at Andhra University, in India.

F. W. Harwood of the University of Tasmania 
is another welcome exotic returnee, in this 
case to temporary duty as a consultant to 
the NSF "automatic classification" project.

Travel highlights during the past year included Jessica Melton's week-long
institute in "New Methods of Information Retrieval," presented at the
Institute of Engineering, University of Mexico, and A. J. Goldwyn's
participation as lecturer in a two-week NATO Advanced Study Seminar in The
Hague, the Netherlands. Record mileage was achieved by Alan Rees, in the
course of numerous trips throughout the East and Middle West as panelist,
speaker or consultant. And by the summer of 1965, Dean Shera returned to
travel status, recovered from a series of physical misadventures.

*The publication of this book was made possible through the generosity of
the Lubrizol Foundation and its Secretary, Mr. Harry L. Jackson. Income
derived from its sale will help to defray the costs of the Conference
itself.

Since there are 35 full-time staff members (before the manning of the 
new NSF project), space at 10831 Magnolia Drive continues to be a problem. 
Extensive remodelling of the carriage house has provided office space for 
the programmers that is conveniently close to the computer. But the "main
house" is still straining at the seams, and expected further expansion of
the education and research programs threatens even more serious crowding. 

A. J. Goldwyn 
Executive Director 
CENTER ACTIVITIES

I. Testing and Evaluation 

A. Development of a Documentation Facility for the Health
Sciences. National Institutes of Health Grant No.
FR-00118-02. Principal Investigator: A. J. Goldwyn.

(References 7, 13, 14, 18, 20, 21, 23, 26, 28) 

Largest in scope and central to much of the CDCR's program, because
it focuses within itself so many of our research concerns, is the
Comparative Systems Laboratory (CSL), supported now for the third year by the
Division of Research Facilities and Resources, National Institutes of 
Health. 

The second year of PHS Grant FR-00118 has continued the work of the 
Comparative Systems Laboratory (CSL) at the Center for Documentation and 
Communication Research (CDCR), as described in the Progress Report for
FR-00118-01. The principal tasks remain: 

1. to determine at what point, and under what conditions, the 
performance of an information retrieval system is optimized; 
and 

2. to establish the comparative performance of a number of system
components, and then of a number of systems, under controlled
conditions.

Major activities have continued to be the formulation of methodology, 
the preparation of files for testing, and the establishment of user groups. 
As this implementation has moved into the second year, major emphasis has 
been on the documentation of effort (16 Internal Reports have been prepared,
some of which will serve as material for general publications); on the
preparation of supporting computer programs (28 programs have been prepared
specifically for this project); and on the working out of the physical modes
of dealing with the user group (apparently trivial but actually complex,
both mechanically and psychologically).

As indicated in the report for FR-00118-01, a good deal of time and 
effort have gone into the development of question analysis as it relates to 
search strategy, and of search strategy itself. These efforts were aided 
during part of the grant year by the full-time services of Dr. I. M. Whitwam, 
who has since returned to medical practice in England, as well as by the 
permanent staff. Aiding also in the development of methodology was a
"pre-pilot" user, Dr. Kenneth S. Warren, Asst. Professor of Medicine, WRU, a
researcher in schistosomiasis, who formulated questions to the file and
evaluated the automatic output provided to him by the system(s). This
contact was developed at the same time that arrangements with the main pilot
group (arranged through the Communicable Disease Center [CDC]) yielded 20
users who submitted a total of 135 questions. Each of these is being
searched, using each of the separately-processed subfiles. (Indexing 
languages, as indicated earlier, include telegraphic abstracts, key words 
[derived both manually and by the computer], meta-language based on titles, 
and the conventional index entries produced routinely for manual use by 
the editors of the Tropical Disease Bulletin. Coding of terminology
approached completion with some 7,000 revised encoded entries. Every 
index term on tape can be searched either as an independent English term 
or as an encoded representation thereof, with associated thesaural values.) 

A somewhat separate effort was directed toward the evaluation of 
indexer qualifications, and variations in efficiency associated therewith. 
Psychological, educational, and professional profiles of a group of 13 
indexers were compiled, and are being correlated with results of a com- 
parison of their work with that done by regular full-time professionals. 
Data on the effects of incentives (pay variations, etc.) are also being accumulated.

Actual interrogation of the file began in April, 1965, and is continuing 
through the series of more than 1,000 separate requests derived from ex- 
haustive analysis of the 135 questions received so far. The guidelines 
developed through the first two years' study will now be "operationally"
tested, and the hoped-for rigidity of the CSL techniques put under fire. 
Measurements developed under FR-00118-01 will be applied, and it is hoped 
that the third year will see the publication of preliminary results. Also 
planned are meticulous statistical appraisal of the data, reappraisal of 
methodology, and preparation of full-text files as the basis for further 
advanced study. Finally, we would hope during the next year to begin to 
develop an approach to the full-scale testing of an already operating 
information-retrieval system for the medical literature or some part
thereof.

Among the contacts with present or potential users established by the 
CSL, in addition to the CDC and the (London) School of Hygiene and Tropical 
Medicine, was the Arthropod-Borne Virus Information Exchange. Public 
presentations of material related to or based on the CSL and explicitly 
credited thereto have been made at the American Documentation Institute 
annual meeting (October, 1964), the National Science Foundation "Study 
Conference on Evaluation of Document Searching Systems and Procedures" 
(October, 1964), the NATO Advanced Study Institute on the Evaluation of 
Information Retrieval Systems (July, 1965), and the Drexel Conference on 
Technical Information Center Administration (June, 1965), among others. 

Tefko Saracevic, Instructor in Library Science, is Manager of the 
project, aided by Mrs. Marilyn Bobka.



B. Search Strategy. Air Force Office of Scientific Research.
Grant No. AF-AFOSR-403-65. Principal Investigator: William
Goffman.

(References 5, 8, 9, 10)


Reversing the more orthodox process of putting theory into practice,
information science has too often been characterized by a somewhat aimless
search for theory to back up or justify already-established practice. At
the CDCR, the investigation of the theoretical aspects of information
retrieval has been a cornerstone of the research program for a number of
years. The AFOSR project, under the active direction of Dr. William Goffman,
has made an ever-stronger contribution to both educational and research
activities, particularly in the area of experimental design and evaluation
techniques.

Able assistance up to the Spring of 1965 was given by Dr. Vaun A. Newill,
who left at that time for a tour of PHS duty in Japan. Dr. A. D. Booth,
whose appointment to the faculty was announced last year, is now active in
the advising of doctoral candidates and in research design, as well as in
the teaching of information retrieval theory.

AFOSR activities have included: 

1. A mathematical model has been developed, in which epidemic
theory is applied to the spread of information within a
population. This model is an extension of the mathematical
model of an information retrieval process which was
constructed during the previous year.

A communication process can be looked at as an epidemic
process in which there is embedded an information retrieval
process which functions as an instrument for providing
effective contact between the susceptibles (users) and
infectives (file).

With this approach, it becomes possible to consider a
communication process in terms of an optimal control
problem. For example, it is possible to determine the
precise point in time at which it is necessary to introduce
an information retrieval system into a given population
of research workers.

2. A series of experiments was designed for the purpose of
testing the theory. The first experiment consisted of
a comparative study of the effectiveness of an expert,
a machine and chance in searching titles in a sub-area
of the field of diabetes. This study is the first phase
of a set of experiments to determine whether an automatic
device can perform a complex task such as searching
literature by means of a sequence of simple tasks, each
one of which it can accomplish as well as a human. The
nature of these tasks would be determined by applying
a theory in which the physical process has been reduced
to its elements and in which the fundamental relationships
between these elements and their essential properties
have been established.

3. The continued development and application of a methodology 
for comparative studies of information retrieval systems. 
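
As an illustration only, the epidemic analogy in item 1 can be sketched with a standard susceptible-infective-removed model. This is a minimal sketch and not the Center's own formulation: the function names, the two rate parameters, and the simple Euler time step are all assumptions introduced here. Susceptibles stand for users who have not yet met an idea, infectives for those (or the file) able to pass it on.

```python
def simulate_spread(susceptible, infective, removed,
                    contact_rate, removal_rate, steps, dt=0.1):
    """Euler integration of a basic susceptible-infective-removed model.

    All parameter names are illustrative assumptions, not taken from the
    AFOSR project reports.
    """
    s, i, r = float(susceptible), float(infective), float(removed)
    history = [(s, i, r)]
    for _ in range(steps):
        new_contacts = contact_rate * s * i * dt   # susceptibles reached
        new_removals = removal_rate * i * dt       # infectives dropping out
        s -= new_contacts
        i += new_contacts - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


def spreads(susceptible, contact_rate, removal_rate):
    """Threshold test: infectives grow at the outset only if
    contact_rate * susceptible exceeds removal_rate."""
    return contact_rate * susceptible > removal_rate
```

In this picture, one way to read the role of an information retrieval system is as a device that raises `contact_rate`, and the threshold test suggests why the moment of its introduction matters.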


C. Empirical Study of Relevance Assessments to Document 
Searching. National Science Foundation. Contract No. 
NSF-C-423. Principal Investigator: Alan M. Rees. 


This project is described on the first page (General Notes) of this
Newsletter.


II. "Systems Approaches"

When this heading first appeared in last year's Newsletter, it had some
value if not of novelty at least of freshness. During the past year, the
"systems approach" has become a vogue expression, like so many others
in the brief history of documentation. The CDCR continues to believe that
the system is the sum of its parts, and to consider the system for each
specific application as a complex of peculiarly appropriate subsystems or
components. Purpose, in this definition of systems study, is as important
as function. With this philosophy, we hope to avoid the all-purpose approach
to systems qua systems, which proceeds from the assumption that a
document-retrieval system is not only like a neural network, for example, or
an interurban traffic exchange pattern, but is in fact the same thing. Our
library base, in effect, becomes clear.

Thus, in each project listed below, a specific purpose has generated 
the guidelines for project development. These projects are, in a sense, 
task-oriented, although this does not preclude generalizable results. The 
"systems approach" to each has meant the identification and accomplishment
of sets of tasks and sub-tasks oriented to a specific goal.

A. An Operating Test of a Pilot Educational Media Research 
Information Center. U. S. Office of Education. Title 
VII Project B-170b. Principal Investigator: A. J. Goldwyn. 

(References 1, 2, 3, 4, 29) 

Research and development activities related to the establishment of an
Educational Media Research Information Center (EMRIC) were concluded in June
of 1965.* The CDCR's current research for the USOE is now focused on the
preparation of a thesaurus of education terms.** The primary purpose of
the thesaurus will be to provide a basic indexing language which can be
utilized both for generalized and specialized purposes. One immediate
application upon completion will be its incorporation into the coordinate
indexing system proposed by the USOE's Educational Research Information
Center (ERIC). Beyond this, the standardized indexing language it will
provide should improve communications within and between developing
education information centers.

*A final report will be available for distribution in late fall. Results
of pilot searching will appear in the fall 1965 issue of AV Communication
Review.

**Initiated in February, 1965.

The first task in thesaurus preparation has been the identification of
significant terms covering all aspects and areas of education. An important
source of such terms has been the Semantic Code Dictionary of Education
developed and maintained at the CDCR, which represents the accretion of
index terms assigned to articles and reports over the past four years.
Terms have been gathered from other sources -- the index to the Encyclopedia
of Education Research, titles of Cooperative Research reports, etc. -- and
incorporated into the thesaurus as candidate entries.

Preliminary criteria have been developed for inclusion/exclusion of 
terms. A core list of some 5,000 terms has been accumulated, which reflects 
both deletions and additions to the original list as provided by the Semantic 
Code Dictionary. Candidate terms have been sorted into categories which 
are analogous to the facets of a faceted classification. The CDCR is in- 
corporating the basic elements of a faceted classification into the thesaurus 
because it believes such an arrangement offers several advantages not found 
in more conventional approaches. Among these advantages are:

1. A more usable and more effective display of hierarchical,
synonymous and collateral relationships among terms and
sets of terms.

2. More consistency in term selection for indexing and searching
by providing a more rigidly structured vocabulary.

3. A reduction in the number of RT's (related terms) for each
individual entry in the thesaurus, thereby facilitating
indexing and searching.
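
To make the three advantages concrete, a thesaurus entry with a facet, a broader term, synonyms and related terms might be represented as below. This is a minimal sketch under stated assumptions: the field names, the `Thesaurus` class and the sample terms are all hypothetical and are not drawn from the CDCR core list or its actual record layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entry:
    """One preferred term; every field name is an illustrative assumption."""
    term: str
    facet: str                                          # category of the term
    broader: Optional[str] = None                       # hierarchical parent
    used_for: List[str] = field(default_factory=list)   # synonyms
    related: List[str] = field(default_factory=list)    # collateral terms (RT)


class Thesaurus:
    def __init__(self) -> None:
        self.entries: Dict[str, Entry] = {}
        self.synonyms: Dict[str, str] = {}

    def add(self, entry: Entry) -> None:
        self.entries[entry.term] = entry
        for synonym in entry.used_for:
            self.synonyms[synonym] = entry.term

    def preferred(self, word: str) -> str:
        """Map a synonym to its preferred term; other words pass through."""
        return self.synonyms.get(word, word)

    def narrower(self, term: str) -> List[str]:
        """All terms below `term` in the hierarchy, alphabetically."""
        found: List[str] = []
        for entry in self.entries.values():
            if entry.broader == term:
                found.append(entry.term)
                found.extend(self.narrower(entry.term))
        return sorted(found)


# Hypothetical sample terms, used only to exercise the structure.
thesaurus = Thesaurus()
thesaurus.add(Entry("instructional media", facet="media"))
thesaurus.add(Entry("films", facet="media", broader="instructional media",
                    used_for=["motion pictures"]))
thesaurus.add(Entry("film loops", facet="media", broader="films"))
thesaurus.add(Entry("educational television", facet="media",
                    broader="instructional media"))
```

Because hierarchy is carried by `broader` within a facet, the `related` list need hold only the remaining collateral links, which is one possible reading of the third advantage.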

Certain parallels exist between the development of a thesaurus of educa- 
tion terms and the development of thesauri by the Engineers Joint Council 
(EJC) and the American Petroleum Institute (API). These are being explored 
with information experts closely associated with the EJC and API. Manager 
of the various educational research information projects is Gordon C. Barhydt, 
aided by Charles T. Schmidt. Associated with Gordon C. Barhydt in the pre- 
paration of the thesaurus of education-related terms is Alan M. Rees. 



B. Automatic Processing of Metallurgical Abstracts for the
Purpose of Information Retrieval. National Science Foundation.
Grant No. NSF-GN-303. Principal Investigator:
Jessica S. Melton.

(References 15, 16, 17)


Research is continuing on the development and testing of methods for 
processing previously generated abstracts for information retrieval. The 
aim of the research, more specifically, is to automate subject indexing,
perhaps the most difficult and expensive task of information systems when 
performed manually. 

Briefly, the procedure is as follows. Abstracts from the metallurgical
section of Chemical Abstracts are punched on Flexowriter tape and
converted to magnetic tape, preserving the typographical characteristics 
of the printed text. 

Computer procedures are designed for processing the abstracts on a
level above straight dictionary look-up while avoiding total linguistic
analysis. Metallurgical terms in the text are located and grouped according
to a strict subject indexing rationale. A corpus of 768 abstracts
(100,000 words of running text) has been analyzed by the computer
techniques developed for this research. These are on magnetic tape in
search-ready form.

The project has now reached the testing phase. Real questions submitted
to an operational information retrieval service (The American
Society for Metals Documentation Service) are being addressed to the
automated system. The answers are compared to those which ASM actually
sent to their subscribers in response to the questions. Results of these
test searches will be used to refine and evaluate the system.
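
The comparison described above can be pictured as a set comparison between the abstracts the automated system retrieves for a question and those ASM sent for the same question. The sketch below is an assumption-laden illustration: the function name, the abstract identifiers, and the choice to report the two ratios are introduced here, and both ratios are measured only against the ASM answers, not against any independent judgment of relevance.

```python
def compare_answers(system_output, reference_answers):
    """Compare one question's automated output with the reference answers.

    `reference_answers` stands for what the operating service sent out;
    treating it as the yardstick is an assumption of this sketch.
    """
    system, reference = set(system_output), set(reference_answers)
    common = system & reference
    return {
        "common": sorted(common),              # found by both
        "missed": sorted(reference - system),  # sent by the service only
        "extra": sorted(system - reference),   # retrieved by the system only
        "recall": len(common) / len(reference) if reference else 0.0,
        "precision": len(common) / len(system) if system else 0.0,
    }
```

Items in the "extra" group would still need human review, since the automated system may have found pertinent abstracts that the operating service did not send.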

Manager of the project is Celeste Hespen, aided by Dr. M. I. Sastri
as Structural Linguist.



C. Manipulation of Autopsy Diagnoses by Computer Technique.
Cooperative Project: CDCR and WRU Institute of Pathology.
Project Directors: J. Chandler Smith, M.D., and Jessica
S. Melton.

(References 25, 26, 27, 28)

A fully automated information retrieval system for autopsy records, 
utilizing the original record (punched on paper tape) as computer input, 
has been developed for the Institute of Pathology of Western Reserve
University. Preliminary work leading to this system is partly supported
within the framework of the Comparative Systems Laboratory.

The project has generated an unusual amount of interest, particularly 
after the appearance of the JAMA article (28). The system is currently
being tested by the Armed Forces Institute of Pathology (AFIP), which
compiles autopsy and surgical data from all Armed Forces Hospitals and
all Veterans Administration Hospitals. Approximately 50,000 autopsy
reports are received, coded, and filed by the AFIP each year. The National
Laboratory of the Atomic Energy Commission at Oak Ridge, Tennessee, is
also testing the system for possible use in large-scale animal experiments;
the Computer Laboratory at the CDCR is assisting in the pilot phase. The
potential of the method has been recognized in academic centers; the
Universities Associated for Research and Education in Pathology, Inc.,
has held the first of a series of meetings in Washington to consider
applications of the method.

D. Diabetes Literature Retrieval. Joint project with the
University of Minnesota and the University of Rochester
under the sponsorship of the National Institute of
Arthritis and Metabolic Diseases, Grant No. AM 06399-04.
Principal Investigator (for WRU): A. J. Goldwyn.

A portion of the Center's work in cooperation with the American Diabetes
Association project, described in previous Newsletters, resulted in the
preparation and publication of a title and key-word index of the
diabetes-related citations in Index Medicus for 1962. Similar indexes for 1960
and 1961 had been completed previously. Preparation of citations for 1963
is now in process, and it is expected that those for 1964 will follow
shortly thereafter.
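
For readers unfamiliar with this kind of index, the sketch below shows one simple way a key-word index can be derived from citation titles. It is a hypothetical illustration, not the Center's program: the stop-word list, the function name and the citation numbering are assumptions made here.

```python
import re
from collections import defaultdict

# Words too common to serve as index entries (an assumed, partial list).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def keyword_index(citations):
    """Map each significant title word to the citation numbers that use it.

    `citations` is a dict of citation number -> title string.
    """
    index = defaultdict(set)
    for number, title in citations.items():
        for word in re.findall(r"[a-z]+", title.lower()):
            if word not in STOP_WORDS:
                index[word].add(number)
    # Alphabetical order of key words, ascending citation numbers.
    return {word: sorted(numbers) for word, numbers in sorted(index.items())}
```

A printed index would then list each key word with the titles or citation numbers filed under it.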

At an August meeting at the Marine Biological Laboratory, Woods Hole, 
Massachusetts, plans were completed for the testing of an experimental 
current awareness service. A thesaurus of diabetes-related terminology 
has been developed at Minnesota to a preliminary stage. At WRU, many of 
the computer programs for searching the literature have been written,
potential users have been contacted, and work begun on preparation of
user profiles. Of particular interest in this connection is the coordination
of the Center's selection and manipulation of diabetes-related citations
with the accession program and the computer-based bibliographic work of
the National Library of Medicine. Alan M. Rees, LaVahn Overmyer and
Robert L. Jacobs have collaborated in the direction of this project.
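
A current awareness service of the kind planned here matches each incoming citation against stored user profiles. The following is a minimal sketch under assumptions: profiles are taken to be plain lists of interest terms, matching is by shared terms, and the function name, the `threshold` parameter and the sample users are all hypothetical rather than features of the WRU programs.

```python
def match_profiles(profiles, citation_terms, threshold=1):
    """Return the users to notify about one citation.

    `profiles` maps a user name to a list of interest terms; a user is
    notified when at least `threshold` terms are shared with the citation.
    """
    terms = {term.lower() for term in citation_terms}
    notify = {}
    for user, interests in profiles.items():
        shared = sorted(terms & {term.lower() for term in interests})
        if len(shared) >= threshold:
            notify[user] = shared   # the terms that triggered the match
    return notify
```

Raising `threshold` trades fewer notices for a greater risk of missing a citation the user would have wanted.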

E. Library Mechanization Feasibility Study

LaVahn Overmyer, Asst. Director of the CDCR, has completed a study to 
determine the feasibility of mechanizing certain routines within the (WRU) 
University Libraries. Dr. Lyon Richardson, Director of the University
Libraries, and Asst. Professor Overmyer have submitted the report to
President John S. Millis of Western Reserve in the form of a proposal for
a pilot project. It is hoped in this way to provide both a laboratory and 
a demonstration project, from the point of view of our training program — 
while at the same time making a significant contribution to the service 
value of the library complex. 
III. Educational Programs

(References 11, 12, 19, 24)


The documentation and information retrieval curriculum of the School 
of Library Science of Western Reserve University is closely associated with 
the research activities of the Center. The merging of an instructional 
situation with a research environment has contributed to the widening of the
students' base of experience and knowledge.

This association makes possible intensive instruction in basic principles; the
assignment of problem-solving tasks, designed to give students experience in
system operation and development; and the opportunity for participation in
the research programs at the Center. It may be of interest to repeat the
description of three courses added last year to the "basic" series in
documentation and information retrieval. These are Information Retrieval
Systems II, Automation of Library Processes and Procedures, and Introduction
to Information Retrieval Theory.

Information Retrieval Systems II 

Experience is provided with respect to the operation of an 
information retrieval system. Component parts of a total system
are analyzed, such as acquisition, indexing, file arrangement,
question analysis, search strategy and evaluation of outputs, to
illustrate their interaction. Practical experience is given for
each sub-system. Each student is required to index a number of
documents utilizing several indexing languages. Questions are
assigned for analysis and searching, to explore the matching of
questions and indexing languages. Students become familiar with
the manipulation of a number of storage media (hand-sorted punched
cards, magnetic tape, peek-a-boo cards, etc.). Search results
are analyzed and tabulated by students and are related to indexing
decisions, question analysis and search strategy used. The course,
Library Science 574, has been developed and taught by the staff,
including Mr. Saracevic, Lecturer in Library Science.

Automation of Library Processes and Procedures 

This course is planned to survey and evaluate the possible 
uses of data processing equipment within the traditional library 
functions — administration, acquisitions, catalog production, 
circulation, intercommunication, etc. Punch cards, computers, 
micro-records, photography, and visuals are discussed; comparative 
costs are considered; current library installations are reviewed.
The course, Library Science 572, has been developed and taught by
Asst. Professor Overmyer.

Introduction to Information Retrieval Theory

An elementary treatment of certain mathematical tools needed
in the construction of abstract theories and models in the field
of information retrieval. Applications of these tools to the design
and evaluation of retrieval systems are discussed. The course,
Library Science 577, has been developed and taught by Dr. Goffman
and Dr. Booth. An earlier version in the form of a lecture series
was presented by Robert A. Fairthorne.


Courses: Academic Year 1965-66

LS. 524   Documentation
LS. 540   Theory of Classification
LS. 572   Automation of Library Processes and Procedures
LS. 573   Information Retrieval Systems I
LS. 574   Information Retrieval Systems II
LS. 575   Information Processing on Computers
LS. 576   Automatic Language Processing
LS. 577   Introduction to Information Retrieval Theory
LS. 578   Specialized Information Centers and Services
LS. 580   Research in Information Retrieval


Special Note. A recent revision of "core" courses required of all those
matriculating in the degree programs of the School of Library Science has
rendered obsolete earlier printed descriptions of curricula. This change
will be of particular interest to those who wish to specialize in the
Documentation or Information-Science sequence.

For further information, please address the Director of
Admission, Admission Office, Western Reserve University, Cleveland, Ohio
44106, specifically designating the "new" master's or doctoral program.